
 Parkinson's Disease


CASCADE Conformal Prediction: Uncertainty-Adaptive Prediction Intervals for Two-Stage Clinical Decision Support

arXiv.org Machine Learning

Effective medication management in Parkinson's Disease (PD) is challenging due to heterogeneous disease progression, variable patient response, and medication side effects. While AI models can forecast levodopa equivalent daily dose (LEDD) as a measure of medication needs, standard uncertainty quantification often fails to communicate the reliability of these predictions, treating high- and low-confidence clinical decisions identically. We introduce CASCADE (Calibrated Adaptive Scaling via Conformal And Distributional Estimation), a novel conformal prediction framework that propagates epistemic uncertainty from a screening classifier to adapt downstream predictions. Unlike standard conformal methods that rely on auxiliary residual regression, we leverage epistemic uncertainty from a primary classification task (identifying whether a medication change is needed) to dynamically scale the prediction intervals of a secondary regression task (predicting how much change). By mapping Venn-Abers multi-probabilistic uncertainty directly to non-conformity scores, our framework achieves continuous risk adaptation. We demonstrate that this "cascade effect" produces highly efficient intervals for confident patients (38.9% narrower than standard conformal baselines) while automatically expanding intervals to ensure robust coverage for uncertain cases, bridging the gap between discrete clinical decision-making and continuous dose forecasting in PD.
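The abstract above does not give the exact mapping from Venn-Abers uncertainty to non-conformity scores, so the following is only a minimal sketch of the general idea, using ordinary normalized split conformal regression rather than CASCADE itself. It assumes each patient has a point prediction and a scalar classifier uncertainty (for example, the width of a Venn-Abers probability interval). The function names, the `eps` smoothing term, and the linear scaling are hypothetical choices made for illustration.

```python
import math

def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))  # rank of the order statistic to use
    if k > n:
        return math.inf  # too few calibration points for this alpha
    return sorted(scores)[k - 1]

def calibrate(y_true, y_pred, uncertainty, alpha=0.1, eps=0.05):
    """Compute the scaling quantile from a held-out calibration set.

    Assumed score: absolute residual divided by (uncertainty + eps), so that
    residuals of uncertain patients count for less during calibration.
    """
    scores = [
        abs(y - p) / (u + eps)
        for y, p, u in zip(y_true, y_pred, uncertainty)
    ]
    return conformal_quantile(scores, alpha)

def scaled_interval(y_pred, uncertainty, q, eps=0.05):
    """Prediction interval whose half-width grows with classifier uncertainty."""
    half_width = q * (uncertainty + eps)
    return y_pred - half_width, y_pred + half_width
```

With a quantile `q` obtained from `calibrate`, a patient with low classifier uncertainty receives a narrow interval from `scaled_interval`, while a patient with high uncertainty receives a proportionally wider one. This is the qualitative behavior the abstract describes.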



YouTubePD: A Multimodal Benchmark for Parkinson's Disease Analysis Supplementary Material

Neural Information Processing Systems

We include all our annotations and extracted landmarks. This ensures that we uphold the highest standards of ethical data usage. In Table A1, we summarize the severity label distribution in YouTubePD. We also summarize the demographic distribution in YouTubePD, split between PD-positive and healthy control (HC), or PD-negative, subjects. This decision is based on the clinician's suggestion, since an accurate UPDRS facial expression rating would require more This strategy also allows for a finer classification.






Predicting Parkinson's Disease Progression Using Statistical and Neural Mixed Effects Models: Comparative Study on Longitudinal Biomarkers

arXiv.org Machine Learning

Predicting Parkinson's Disease (PD) progression is crucial, and voice biomarkers offer a non-invasive method for tracking symptom severity (UPDRS scores) through telemonitoring. Analyzing this longitudinal data is challenging due to within-subject correlations and complex, nonlinear patient-specific progression patterns. This study benchmarks linear mixed models (LMMs) against two advanced hybrid approaches: the Generalized Neural Network Mixed Model (GNMM) (Mandel 2021), which embeds a neural network within a generalized linear mixed model (GLMM) structure, and the Neural Mixed Effects (NME) model (Wortwein 2023), which allows nonlinear subject-specific parameters throughout the network. Using the Oxford Parkinson's telemonitoring voice dataset, we evaluate these models' performance in predicting Total UPDRS to offer practical guidance for PD research and clinical applications.
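For orientation, the textbook form of a linear mixed model can be written as follows. The notation is standard and not taken from the paper: $y_{ij}$ is the $j$-th measurement for subject $i$, $\boldsymbol{\beta}$ holds the fixed effects shared by all subjects, and $\mathbf{b}_i$ holds the subject-specific random effects.

```latex
y_{ij} = \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + \mathbf{z}_{ij}^{\top}\mathbf{b}_i + \varepsilon_{ij},
\qquad \mathbf{b}_i \sim \mathcal{N}(\mathbf{0}, \mathbf{G}),
\qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)

% Special case: a random intercept only (z_{ij} = 1, b_i ~ N(0, \sigma_b^2)).
% Two measurements j \neq k from the same subject are then correlated:
\operatorname{Corr}(y_{ij}, y_{ik}) = \frac{\sigma_b^2}{\sigma_b^2 + \sigma^2}
```

The second expression shows how a shared random effect induces the within-subject correlation that the abstract names as a modeling challenge. The neural variants mentioned above replace the linear predictor with a network, in ways the abstract does not detail.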


YouTubePD: A Multimodal Benchmark for Parkinson's Disease Analysis

Neural Information Processing Systems

The healthcare and AI communities have witnessed a growing interest in the development of AI-assisted systems for automated diagnosis of Parkinson's Disease (PD), one of the most prevalent neurodegenerative disorders. However, progress in this area has been significantly impeded by the absence of a unified, publicly available benchmark, which prevents comprehensive evaluation of existing PD analysis methods and the development of advanced models. This work overcomes these challenges by introducing YouTubePD -- the first publicly available multimodal benchmark designed for PD analysis. We crowd-source existing videos featuring PD from YouTube, exploit multimodal information including in-the-wild videos, audio data, and facial landmarks across 200+ subject videos, and provide dense and diverse annotations from a clinical expert. Based on our benchmark, we propose three challenging and complementary tasks encompassing both discriminative and generative tasks, along with a comprehensive set of corresponding baselines. Experimental evaluation showcases the potential of modern deep learning and computer vision techniques, in particular the generalizability of the models developed on YouTubePD to real-world clinical settings, while revealing their limitations. We hope our work paves the way for future research in this direction.


Towards Robust Assessment of Pathological Voices via Combined Low-Level Descriptors and Foundation Model Representations

arXiv.org Artificial Intelligence

Perceptual voice quality assessment plays a vital role in diagnosing and monitoring voice disorders. Traditional methods, such as the Consensus Auditory-Perceptual Evaluation of Voice (CAPE-V) and the Grade, Roughness, Breathiness, Asthenia, and Strain (GRBAS) scales, rely on expert raters and are prone to inter-rater variability, emphasizing the need for objective solutions. This study introduces the Voice Quality Assessment Network (VOQANet), a deep learning framework that employs an attention mechanism and Speech Foundation Model (SFM) embeddings to extract high-level features. To further enhance performance, we propose VOQANet+, which integrates self-supervised SFM embeddings with low-level acoustic descriptors, namely jitter, shimmer, and harmonics-to-noise ratio (HNR). Unlike previous approaches that focus solely on vowel-based phonation (PVQD-A), our models are evaluated on both vowel-level and sentence-level speech (PVQD-S) to assess generalizability. Experimental results demonstrate that sentence-based inputs yield higher accuracy, particularly at the patient level. Overall, VOQANet consistently outperforms baseline models in terms of root mean squared error (RMSE) and Pearson correlation coefficient across CAPE-V and GRBAS dimensions, with VOQANet+ achieving even greater performance gains. Additionally, VOQANet+ maintains consistent performance under noisy conditions, suggesting enhanced robustness for real-world and telehealth applications. This work highlights the value of combining SFM embeddings with low-level features for accurate and robust pathological voice assessment. This work was supported in part by the National Science and Technology Council and Academia Sinica.
Whenty Ariyanti is with the Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei 106, Taiwan, and also with the Research Center for Information Technology Innovation, Academia Sinica, Taipei 11529, Taiwan (e-mail: d11115805@mail.ntust.edu.tw). Kuan-Yu Chen is with the Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei 106, Taiwan (e-mail: kychen@mail.ntust.edu.tw). Sabato Marco Siniscalchi is with the University of Palermo, Palermo, Italy (e-mail: sabatomarco.siniscalchi@unipa.it).
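The voice-assessment abstract above names two evaluation metrics (RMSE and the Pearson correlation coefficient) and low-level descriptors such as jitter and shimmer. The sketch below shows how these quantities are commonly computed. It is not the authors' code: the function names are hypothetical, and `relative_perturbation` uses the common "local" definition (mean absolute difference between consecutive values, divided by the mean value). Applied to glottal cycle periods this gives jitter, and applied to cycle peak amplitudes it gives shimmer. Extracting those cycles from audio is assumed to happen elsewhere.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between rater scores and model predictions."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def relative_perturbation(values):
    """Local cycle-to-cycle perturbation of a sequence of per-cycle values.

    Pass cycle periods for jitter or cycle peak amplitudes for shimmer.
    """
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return mean_diff / mean_value
```

A lower RMSE and a higher Pearson correlation both indicate closer agreement with the expert ratings. The two are reported together because correlation alone ignores systematic offsets in the predicted scores.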